I would like to find out whether crimes are more likely to happen around subway stations. To do this, I need two data sets: the list of reported crimes in Boston and the locations of MBTA subway stops. I will then create a buffer around each subway stop and count the crimes that fall inside it. A buffer analysis suits this question because it shows whether reported crimes cluster within a fixed distance of a station, which would suggest (though not prove) an association between stations and crime.
Boston Crime Data 2022: https://data.boston.gov/dataset/crime-incident-reports-august-2015-to-date-source-new-system/resource/313e56df-6d77-49d2-9c49-ee411f10cf58
MBTA subway stop: https://mbta-massdot.opendata.arcgis.com/datasets/MassDOT::mbta-systemwide-gtfs-map/explore?layer=0&showTable=true
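As a rough illustration of the buffer idea, here is a minimal sketch on made-up points. The coordinates, the 200 m radius, and the choice of EPSG:26986 (NAD83 / Massachusetts Mainland, in metres) are all assumptions for the example, not values taken from the data sets above.

```r
library(sf)

# Hypothetical toy coordinates (longitude, latitude); not real MBTA or crime data
stops <- st_as_sf(
  data.frame(stop = c("A", "B"),
             lon = c(-71.0600, -71.1000),
             lat = c(42.3550, 42.3400)),
  coords = c("lon", "lat"), crs = 4326
)
crimes <- st_as_sf(
  data.frame(id = 1:3,
             lon = c(-71.0601, -71.0650, -71.1001),
             lat = c(42.3551, 42.3700, 42.3401)),
  coords = c("lon", "lat"), crs = 4326
)

# Reproject to a metre-based CRS so the buffer distance is expressed in metres.
# EPSG:26986 is an assumed choice for the Boston area.
stops_m  <- st_transform(stops, 26986)
crimes_m <- st_transform(crimes, 26986)

# Draw a 200 m buffer around each stop (the radius is an arbitrary example value)
buffers <- st_buffer(stops_m, dist = 200)

# Count the crime points that fall inside each buffer
buffers$n_crimes <- lengths(st_intersects(buffers, crimes_m))
print(st_drop_geometry(buffers))
```

Reprojecting before buffering matters because a buffer distance given in degrees would not correspond to a constant distance on the ground.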
Here, we import the libraries needed for the spatial analysis.
# sf library to handle shapefiles and sf objects
library(sf)
# dplyr library for data cleaning functions
library(dplyr)
# tmap library for plotting maps
library(tmap)
To keep the analysis simple, we focus only on crimes that happened in February 2022. Since the data set covers only 2022, filtering on the month is enough to select February 2022. It is also a good idea to drop records that have no location, which appear with Lat and Long set to 0. To plot the data on a map, we need to convert the filtered data into spatial data with the st_as_sf() function. Once that is done, we can plot the crimes as dots on an interactive map.
# Setting our tmap to be interactive
tmap_mode("view")
## tmap mode set to interactive viewing
# Importing Boston Crime 2022 Data
crime <- read.csv("bostonCrime2022.csv")
# Keep only crimes that happened in February
crimeFeb <- crime %>% filter(MONTH == 2) %>%
  # Drop crimes that have no location (Lat recorded as 0)
  filter(Lat > 0)
# converting to sf data
crimeSF <- st_as_sf(crimeFeb, coords = c("Long", "Lat"), crs = "WGS84")
# Plotting crime data as dots on an interactive map
tm_shape(crimeSF) + tm_dots()
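Before moving on, it can help to confirm that the filter and the conversion did what we expect. The sketch below repeats the same steps on a small made-up data frame; the values are hypothetical and only the column names (MONTH, Lat, Long) mirror the crime data.

```r
library(sf)
library(dplyr)

# Hypothetical toy records; only the column names mirror the crime data
toy <- data.frame(
  MONTH = c(2, 2, 3, 2),
  Lat   = c(42.35, 0, 42.36, 42.33),
  Long  = c(-71.06, 0, -71.07, -71.09)
)

# Same filtering steps as above: February only, then drop rows with no location
toyFeb <- toy %>%
  filter(MONTH == 2) %>%
  filter(Lat > 0)

# Same conversion to an sf object in WGS84
toySF <- st_as_sf(toyFeb, coords = c("Long", "Lat"), crs = "WGS84")

# Quick checks: how many rows survived, whether the coordinates are
# longitude/latitude, and whether the bounding box looks plausible
print(nrow(toySF))
print(st_is_longlat(toySF))
print(st_bbox(toySF))
```

Running the same checks on crimeSF should show a bounding box around Boston; a box that reaches 0 would mean that records without a location slipped through.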